Reinforcement Learning of Multi-Issue Negotiation Dialogue Policies
Abstract
We use reinforcement learning (RL) to learn a multi-issue negotiation dialogue policy. For training and evaluation, we build a hand-crafted agenda-based policy, which serves as the negotiation partner of the RL policy. Both the agenda-based and the RL policies are designed to work across a large variety of negotiation settings and to perform well against negotiation partners whose behavior has not been observed before. We evaluate the two models by having them negotiate against each other under various settings. The learned model consistently outperforms the agenda-based model. We also ask human raters to judge the rationality of the two negotiators in transcripts of negotiations between the RL policy and the agenda-based policy. The RL policy is perceived as more rational than the agenda-based policy.
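The setup described in the abstract, an RL policy trained against a fixed hand-crafted partner, can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the abstract does not say which RL algorithm is used, so tabular Q-learning is swapped in; the partner is a simple rule that lowers its acceptance threshold each round, not the agenda-based policy of the paper; and the two issues, the utility weights, and all names (`OFFERS`, `partner_accepts`, `train`, `greedy_rollout`) are hypothetical.

```python
import itertools
import random

# Hypothetical toy setting: two issues, three options each (0, 1, 2).
OFFERS = list(itertools.product(range(3), repeat=2))
MAX_ROUNDS = 3
GAMMA = 0.9  # discount: earlier agreements are worth slightly more
ALPHA = 0.5  # learning rate


def learner_utility(offer):
    a, b = offer
    return 2 * a + b  # assumed preference: the learner cares most about issue a


def partner_utility(offer):
    a, b = offer
    return (2 - a) + 2 * (2 - b)  # assumed preference: the partner cares most about issue b


def partner_accepts(offer, round_idx):
    """Hand-crafted partner: concedes by lowering its threshold each round."""
    return partner_utility(offer) >= 5 - round_idx


def step(round_idx, action):
    """Return (reward, next_round); next_round is None when the dialogue ends."""
    offer = OFFERS[action]
    if partner_accepts(offer, round_idx):
        return float(learner_utility(offer)), None
    if round_idx + 1 == MAX_ROUNDS:
        return 0.0, None  # rounds exhausted, no deal
    return 0.0, round_idx + 1


def train(episodes=5000, seed=0):
    """Tabular Q-learning; the state is just the round index."""
    rng = random.Random(seed)
    q = [[0.0] * len(OFFERS) for _ in range(MAX_ROUNDS)]
    for _ in range(episodes):
        state = 0
        while state is not None:
            # Uniform random exploration; Q-learning is off-policy, so the
            # greedy policy can still be read off the table afterwards.
            action = rng.randrange(len(OFFERS))
            reward, nxt = step(state, action)
            target = reward if nxt is None else reward + GAMMA * max(q[nxt])
            q[state][action] += ALPHA * (target - q[state][action])
            state = nxt
    return q


def greedy_rollout(q):
    """Follow the greedy policy; return (offer, round) of the deal, or None."""
    for round_idx in range(MAX_ROUNDS):
        action = max(range(len(OFFERS)), key=lambda i: q[round_idx][i])
        if partner_accepts(OFFERS[action], round_idx):
            return OFFERS[action], round_idx
    return None


if __name__ == "__main__":
    q_table = train()
    print(greedy_rollout(q_table))
```

With these toy numbers the learned greedy policy should hold out in the first round, when the partner only accepts offers worth little to the learner, and then settle in the second round on the offer that gives each side its preferred issue. A realistic negotiation policy would need a much richer state (dialogue history, partner model) than the round index used here.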
Similar resources
Reinforcement Learning of Two-Issue Negotiation Dialogue Policies
We use hand-crafted simulated negotiators (SNs) to train and evaluate dialogue policies for two-issue negotiation between two agents. These SNs differ in their goals and in the use of strong and weak arguments to persuade their counterparts. They may also make irrational moves, i.e., moves not consistent with their goals, to generate a variety of negotiation patterns. Different versions of thes...
Reinforcement Learning of Argumentation Dialogue Policies in Negotiation
We build dialogue system policies for negotiation, and in particular for argumentation. These dialogue policies are designed for negotiation against users of different cultural norms (individualists, collectivists, and altruists). In order to learn these policies we build simulated users (SUs), i.e. models that simulate the behavior of real users, and use Reinforcement Learning (RL). The SUs ar...
Single-Agent vs. Multi-Agent Techniques for Concurrent Reinforcement Learning of Negotiation Dialogue Policies
We use single-agent and multi-agent Reinforcement Learning (RL) for learning dialogue policies in a resource allocation negotiation scenario. Two agents learn concurrently by interacting with each other without any need for simulated users (SUs) to train against or corpora to learn from. In particular, we compare the Q-learning, Policy Hill-Climbing (PHC) and Win or Learn Fast Policy Hill-Climbi...
Learning Culture-Specific Dialogue Models from Non Culture-Specific Data
We build culture-specific dialogue policies of virtual humans for negotiation and in particular for argumentation and persuasion. In order to do that we use a corpus of non-culture specific dialogues and we build simulated users (SUs), i.e. models that simulate the behavior of real users. Then using these SUs and Reinforcement Learning (RL) we learn negotiation dialogue policies. Furthermore, w...
Reinforcement Learning of Multi-Party Trading Dialog Policies
Trading dialogs are a kind of negotiation in which an exchange of ownership of items is discussed, and these kinds of dialogs are pervasive in many situations. Recently, there has been an increasing amount of research on applying reinforcement learning (RL) to negotiation dialog domains. However, in previous research, the focus was on negotiation dialog between two participants only, ignoring c...